Image Processing Device And Image Processing Method

ABSTRACT

An image processing apparatus capable of efficient rendering is provided. In an image processing apparatus which renders, in a screen coordinate system, unit figures each constituting the surface of a three-dimensional object to be rendered, a rasterizing unit divides a rendering area corresponding to a screen into multiple unit areas, while a first unit figure is projected onto a screen coordinate system, and outputs the unit areas. A similar process is applied to second and subsequent unit figures so that the multiple unit areas constituting each unit figure are sequentially output. An area divider divides each of the unit areas sequentially output from the rasterizing unit into multiple subareas. An area discarder discards as necessary a subarea obtained by the division by the area divider according to a predetermined rule. An area writer re-merges subareas that survived the discarding process by the area discarder and writes merged areas obtained by re-merge in the memory.

This application is a National Phase Application of International Application No. PCT/JP2006/302518, filed Feb. 14, 2006, which claims the benefit under 35 U.S.C. 119 (a-e) of Japanese Application No. 2005-047400 filed Feb. 23, 2005, which is herein incorporated by reference.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to an image processing technology in a graphics processor or the like.

2. Description of the Related Art

With the remarkable development of computer graphics technology and image processing technology used in computer gaming, digital broadcast etc., in recent years, more detailed display of three-dimensional graphics etc. has become possible. In three-dimensional graphics processing, a three-dimensional object having three-dimensional coordinate data is projected onto a two-dimensional screen for display on a display or the like.

Normally, a three-dimensional object is modeled using a combination of polygons (e.g., triangles). In projecting a three-dimensional object onto a two-dimensional screen, rasterization is performed wherein values like luminance of pixels inside the polygons are calculated by referring to data on vertices of the polygons.

In rasterization, a linear interpolation method called Digital Differential Analyzer (DDA) is used. DDA allows obtaining the gradient between data for a vertex of a polygon and data for another vertex of the polygon in the direction of a side of the polygon. The gradient thus obtained is used to compute data for the polygon sides. Subsequently, pixels inside the polygons are generated by computing the gradient in the raster-scan direction.
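By way of a non-limiting illustration, the two-stage interpolation described above may be sketched as follows. The function name `dda_interpolate`, the sample luminance values and the step counts are assumptions introduced only for this sketch; they are not taken from the cited documents.

```python
def dda_interpolate(start, end, steps):
    """Linearly interpolate from start to end, returning steps + 1 samples.

    The gradient is computed once and then accumulated, which is the
    essence of a DDA-style interpolation.
    """
    if steps <= 0:
        return [start]
    gradient = (end - start) / steps
    values = []
    value = start
    for _ in range(steps + 1):
        values.append(value)
        value += gradient
    return values


# Stage 1: interpolate a vertex attribute (here, a hypothetical luminance)
# along two sides of a polygon, giving one value per scan line.
left_edge = dda_interpolate(0.0, 8.0, 4)
right_edge = dda_interpolate(16.0, 24.0, 4)

# Stage 2: for each scan line, interpolate between the two side values
# in the raster-scan direction to obtain the pixels inside the polygon.
rows = [dda_interpolate(left, right, 4)
        for left, right in zip(left_edge, right_edge)]
```

In this sketch every scan line is given the same number of pixels for simplicity; an actual rasterizer would derive the span length from the interpolated side coordinates.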

Patent document 1 discloses a technology for improving rendering speed whereby a pixel group comprising a plurality of pixels included in a predefined rectangular area is dealt with as a unit of processing and a set of pixel groups is transmitted to a processing block in the subsequent stage.

Patent document 2 teaches an improvement over the technology described in patent document 1. Processing efficiency is improved by merging multiple pixel groups into a single pixel group before transmitting the same to a processing block in the subsequent stage.

[patent document 1] JP 2000-338959 A

[patent document 2] JP 2003-123082 A

With improvements in the performance of image processing apparatuses for processing three-dimensional computer graphics in recent years, the size of a polygon forming a three-dimensional object tends to be smaller in order to render a more detailed three-dimensional object. Consequently, pixel groups including only a limited number of valid pixels are increasingly likely to be generated. Against this background, more efficient merging of rectangular areas (pixel groups) is called for.

SUMMARY OF THE INVENTION

A general purpose of the present invention in this background is to provide an image processing apparatus capable of achieving efficient image processing.

An embodiment of the present invention pertains to an image processing technology which renders, in a screen coordinate system, unit figures each constituting the surface of a three-dimensional object to be rendered. In this image processing technology, a unit figure is divided into multiple unit areas on a screen coordinate system so that the unit areas are output. Each of the output unit areas is divided into multiple subareas. Selected ones of the multiple subareas are discarded according to a predetermined rule. Rendering is performed on subareas that survived the discarding process.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments will now be described, by way of example only, with reference to the accompanying drawings which are meant to be exemplary, not limiting, and wherein like elements are numbered alike in several Figures, in which:

FIG. 1 shows the structure of an image processing apparatus according to a first embodiment.

FIG. 2 shows how pixels are generated by a rasterizing unit.

FIGS. 3A-3C show how multiple stamps are merged under different conditions.

FIG. 4 shows how a stamp is divided into quads.

FIG. 5 is a flowchart showing how stamps are merged in a merger according to the first embodiment.

FIG. 6 shows how stamps are merged according to the flowchart shown in FIG. 5.

FIG. 7 shows the structure of an area writer.

FIG. 8 shows how quads are distributed by the area writer to output units.

FIGS. 9A-9C show how the area writer writes a merged stamp into a cache memory.

FIG. 10 shows another example of merging stamps.

FIG. 11 shows how the quads are distributed to the output units in the area writer.

FIGS. 12A-12C show how the area writer writes merged stamps shown in FIG. 10 into the cache memory.

FIG. 13 shows how merge proceeds according to a second embodiment when stamps having different stamp addresses are input in succession.

FIG. 14 shows how the quads are distributed to the output units in the area writer.

FIGS. 15A-15C show how the area writer writes a merged stamp of FIG. 13 into the cache memory 50.

DETAILED DESCRIPTION OF THE INVENTION

The invention will now be described based on preferred embodiments which do not intend to limit the scope of the present invention but exemplify the invention. All of the features and the combinations thereof described in the embodiment are not necessarily essential to the invention.

Before giving a specific description of embodiments, a summary of the embodiments will be given.

An embodiment of the present invention relates to an image processing apparatus which renders, in a screen coordinate system, unit figures each constituting the surface of a three-dimensional object to be rendered. The image processing apparatus comprises: a rasterizing unit which divides a unit figure into a plurality of unit areas on the screen coordinate system and outputs the unit areas; an area divider which divides each of the unit areas output from the rasterizing unit into a plurality of subareas; an area discarder which discards as necessary a subarea obtained by the division by the area divider according to a predetermined rule; and an area writer which writes a subarea that survived the discarding process by the area discarder into a memory.

According to this embodiment, the number of subareas transmitted to a subsequent stage can be reduced for efficient rendering, by dividing each of the unit areas into the subareas and discarding unnecessary subareas. A unit figure is a polygon or a curved surface patch, by way of example.

The area writer may re-merge subareas that survived the discarding process and write merged areas obtained by the re-merge in the memory.

Of the subareas that survived the discarding process by the area discarder, the area writer may merge subareas derived from unit areas having the same coordinates in the screen coordinate system before the division.

In this case, it is ensured that the subareas included in a merged area obtained as a result of re-merge are all derived from the unit areas at the same position. This eliminates the need to refer to the coordinate of the unit area originating each subarea, in writing data related to the merged area into the memory.

The merged area may have the same size as the unit area. In this case, output data from the rasterizing unit is of the same size as output data from the area writer. By ensuring that a unit throughput in the rasterizing unit is equal to a unit throughput in writing into the memory, hardware and software can be designed flexibly and compatibility with existing systems is enhanced.

The size of the subarea may correspond to a unit throughput in which the area writer writes the subareas into the memory at a time.

The area writer may refer to information indicating the relative position of a subarea in the unit area to which the subarea belonged before the division and write the subarea in an address in the memory corresponding to the information.

The unit area may be a rectangular area, the rasterizing unit may divide a rendering area so that each of the plurality of unit areas includes a pixel group, the pixel number in the vertical direction and the pixel number in the horizontal direction of a pixel group in a given unit area being identical with the corresponding numbers of a pixel group in another unit area, and the area divider may divide the unit area including the pixel group into a plurality of subareas each including a small pixel group, the pixel number in the vertical direction and the pixel number in the horizontal direction of a small pixel group in a given subarea being identical with the corresponding numbers of a small pixel group in another subarea.

Of the plurality of subareas obtained by the division by the area divider, the area discarder may discard a subarea that does not include any valid pixels. The term “valid pixel” means a pixel which represents an area embraced by a unit figure and whose luminance value, fog value etc. have been generated by the rasterizing unit.

By discarding subareas that do not include any valid pixels, the frequency of treating invalid pixels in a processing unit in a subsequent stage is reduced so that efficient rendering is achieved.

Of the subareas that survived the discarding process by the area discarder, the area writer may re-merge subareas which do not include valid pixels at identical coordinates in the screen coordinate system and may write subareas into the memory in units of merged areas obtained by re-merge.

By eliminating from re-merge those subareas that have valid pixels at identical coordinates in the screen coordinate system, loss of information due to overlapping of valid pixels is successfully prevented.

Of the subareas that survived the discarding process by the area discarder, the area writer may merge subareas derived from unit areas having the same coordinates in the screen coordinate system before the division.

In this case, it is ensured that the subareas included in a merged area obtained as a result of re-merge are all derived from the unit areas at the same position. This eliminates the need to refer to the coordinate position of the unit area originating each subarea, in writing data related to the merged area into the memory.

The area writer may refer to information indicating the relative position of a subarea in the unit area to which the subarea belonged before the division so as to write the subarea in an address in the memory corresponding to the information.

Even if the relative position of the subarea changes as a result of re-merge, the subarea can be written in a proper address by referring to the information indicating the relative position of the subarea in the originating unit area.
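The address computation described above may be illustrated by the following sketch. It assumes, purely for illustration, a unit area of 4×4 pixels divided into four 2×2 subareas numbered 0 to 3 in row-major order, a unit-area coordinate expressed as a (column, row) pair, and a memory modeled as a two-dimensional list; none of these representations is prescribed by the text.

```python
UNIT = 4   # assumed unit-area width and height in pixels
SUB = 2    # assumed subarea width and height in pixels


def write_subarea(frame, unit_coord, rel_pos, subarea):
    """Write one subarea at the location given by its originating unit area
    and its relative position within that unit area.

    frame      -- 2D list standing in for the memory (an assumption)
    unit_coord -- (column, row) of the originating unit area
    rel_pos    -- 0..3, relative position of the subarea before re-merge
    subarea    -- 2x2 list; None marks an invalid pixel that is not written
    """
    ux, uy = unit_coord
    sub_row, sub_col = divmod(rel_pos, UNIT // SUB)
    base_x = ux * UNIT + sub_col * SUB
    base_y = uy * UNIT + sub_row * SUB
    for dy in range(SUB):
        for dx in range(SUB):
            pixel = subarea[dy][dx]
            if pixel is not None:
                frame[base_y + dy][base_x + dx] = pixel


# Usage: the subarea may sit anywhere in a merged area, yet it is written
# to the address derived from its original relative position (here 3).
frame = [[None] * 8 for _ in range(8)]
write_subarea(frame, (1, 0), 3, [[7, None], [None, 9]])
```

Because the destination depends only on the stored relative position, the slot that the subarea happens to occupy in the merged area does not affect where it is written.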

The area writer may comprise a memory access unit which writes pixels included in the subarea into the memory in parallel.

Another embodiment of the present invention relates to an image processing method. The image processing method, which renders, in a screen coordinate system, unit figures each constituting the surface of a three-dimensional object to be rendered, comprises: rasterizing by dividing a unit figure into a plurality of unit areas on the screen coordinate system and outputting the unit areas; dividing each of the unit areas output from the rasterizing into a plurality of subareas; discarding as necessary a subarea obtained by dividing the unit area according to a predetermined rule; and writing a subarea that survived the discarding into a memory.

The rasterizing may divide a rendering area so that each of the plurality of unit areas includes a pixel group, the pixel number in the vertical direction and the pixel number in the horizontal direction of a pixel group in a given unit area being identical with the corresponding numbers of a pixel group in another unit area, and the unit area dividing may divide the unit area including the pixel group into a plurality of subareas each including a small pixel group, the pixel number in the vertical direction and the pixel number in the horizontal direction of a pixel group in a given subarea being identical with the corresponding numbers of a pixel group in another subarea.

Of the plurality of subareas obtained by the division, the discarding may discard a subarea that does not include any valid pixels.

Still another embodiment of the present invention relates to a rendering method. The rendering method, which renders, in a screen coordinate system, unit figures each constituting the surface of a three-dimensional object to be rendered, comprises: dividing a unit figure into a plurality of unit areas on the screen coordinate system and outputting the unit areas; and generating merged areas by retrieving, from a plurality of subareas constituting each of the unit areas output from the dividing, subareas that include valid pixels.

Another rendering method, which renders, in a screen coordinate system, unit figures each constituting the surface of a three-dimensional object to be rendered, comprises: dividing a unit figure into a plurality of unit areas on the screen coordinate system and outputting the unit areas; and writing subareas, of a plurality of subareas constituting each of the unit areas output from the dividing, that include valid pixels into a memory in parallel.

Yet another embodiment of the present invention relates to a computer program product. The computer program product, which renders, in a screen coordinate system, unit figures each constituting the surface of a three-dimensional object to be rendered, comprises: a dividing module which causes a computer to divide a unit figure into a plurality of unit areas on the screen coordinate system and output the unit areas; and a merging module which causes a computer to retrieve, from a plurality of subareas constituting each of the unit areas output from the dividing, subareas that include valid pixels so as to generate merged areas.

Another computer program product, which renders, in a screen coordinate system, unit figures each constituting the surface of a three-dimensional object to be rendered, comprises: a dividing module which causes a computer to divide a unit figure into a plurality of unit areas on the screen coordinate system and output the unit areas; and a writing module which causes a computer to write subareas, of a plurality of subareas constituting each of the unit areas output from the dividing, that include valid pixels into a memory in parallel.

Still another embodiment of the present invention relates to an image processing apparatus which renders, in a screen coordinate system, unit figures each constituting the surface of a three-dimensional object to be rendered. The image processing apparatus comprises: a rasterizing unit which divides a rendering area corresponding to a screen into multiple unit areas, while a unit figure is projected onto a screen coordinate system, and subjects a second and subsequent unit figures to a similar process so as to sequentially output a plurality of unit areas constituting each unit figure; an area divider which divides each of the unit areas sequentially output from the rasterizing unit into multiple subareas; an area discarder which discards as necessary a subarea obtained by the division by the area divider according to a predetermined rule; and an area writer which writes a subarea that survived the discarding process in a memory.

Yet another embodiment of the present invention relates to an image processing method which renders, in a screen coordinate system, unit figures each constituting the surface of a three-dimensional object to be rendered. The image processing method comprises: rasterizing by dividing a rendering area corresponding to a screen into multiple congruent unit areas, while a first unit figure is projected onto a screen coordinate system, and subjecting a second and subsequent unit figures to a similar process so as to sequentially output a plurality of unit areas constituting each unit figure; dividing each of the unit areas sequentially output into multiple subareas; discarding as necessary a subarea obtained by dividing the unit area according to a predetermined rule; and writing a subarea that survived the discarding in a memory.

Still another embodiment of the present invention relates to a computer program product. The computer program product, adapted to an image processing apparatus which renders, in a screen coordinate system, unit figures each constituting the surface of a three-dimensional object to be rendered, comprises: rasterizing by causing a computer to divide a rendering area corresponding to a screen into multiple unit areas, while a first unit figure is projected onto a screen coordinate system, and to subject a second and subsequent unit figures to a similar process so as to sequentially generate a plurality of unit areas constituting each unit figure; dividing each of the unit areas sequentially generated in the rasterizing into multiple subareas; discarding as necessary a subarea obtained by the dividing according to a predetermined rule; and writing a subarea that survived the discarding in a memory.

Optional combinations of the aforementioned constituting elements, and implementations of the invention in the form of methods, apparatuses, systems and programs may also be practiced as additional modes of the present invention.

Several modes of the present invention will be described in detail based upon embodiments.

First Embodiment

FIG. 1 shows the structure of an image processing apparatus according to a first embodiment. An image processing apparatus 1000 performs image processing such as three-dimensional computer graphics processing. The image processing apparatus 1000 projects a unit figure (or unit graphic form) constituting the surface of an object to be rendered onto a rendering area in a screen coordinate system and generates pixels so as to render the object, which is ultimately displayed on a display.

The image processing apparatus 1000 comprises a graphics processor 200, a main processor 300, a main memory 400 and a graphics memory 120. The blocks are connected to each other via a bus 500.

The image processing apparatus 1000 is connected to a display apparatus (not shown) which outputs images and graphics generated by the image processing apparatus 1000. The elements illustrated in FIG. 1 and subsequent drawings as functional blocks executing respective processes are implemented by hardware including a CPU, a memory and an LSI and by software including a program provided with reservation and management functions and loaded into the memory. Therefore, it will be obvious to those skilled in the art that the function blocks may be implemented in a variety of non-limiting manners including hardware only, software only or a combination of both.

The main processor 300 performs an operation such as three-dimensional computer graphics modeling. The main memory 400 is a storage area primarily used by the main processor 300. For example, modeling data obtained by processing a task related to computer graphics in the main processor 300 is temporarily stored in the main memory 400.

The graphics memory 120 is a memory area dedicated to graphic-related data used and managed by the graphics processor 200. In addition to a frame buffer and a z buffer for storing image frame data, the graphics memory 120 further comprises areas respectively corresponding to vertex data, texture data and a color look-up table, which are basic data referred to when rendering image frame data.

The graphics processor 200 is a block dedicated to the execution of image processing. The graphics processor 200 performs a series of steps of rendering in which it reads from the main memory 400 three-dimensional modeling data generated by the main processor 300 and generates image frame data by performing coordinate transform, hidden surface elimination, shading etc.

The graphics processor 200 includes a rasterizer 100, a memory interface unit 110 and a display controller 130.

The rasterizer 100 reads three-dimensional modeling data from the main memory 400 so as to acquire vertex data of a primitive to be rendered. The rasterizer 100 performs a view transform in which a primitive in a three-dimensional space is transformed by projecting the primitive onto a screen coordinate system into a figure on a rendering plane. Further, the rasterizer 100 performs a rasterizing process in which the figure on the rendering plane is scanned in the horizontal direction of the rendering plane so as to transform, row by row, the figure into quantized pixels. The rasterizer 100 decomposes the primitive into pixels and computes pixel information for each pixel. The pixel information includes RGB color values, an α value indicating transparency and a Z value indicating depth from a view point.

The rasterizer 100 generates a pixel area of a predetermined size along the scan line and outputs the generated area to the memory interface unit 110. A pixel group output from the rasterizer 100 is temporarily stacked in a queue. The memory interface unit 110 sequentially writes pixel groups stacked in the queue to the graphics memory 120.

The memory interface unit 110 writes pixels in the graphics memory 120 or reads a frame buffer from the graphics memory 120 in accordance with an instruction from the display controller 130. The display controller 130 outputs image data read from the graphics memory 120 to the display apparatus.

A detailed description will now be given of the structure and operation of the rasterizer 100.

The rasterizer 100 includes a rasterizing unit 10, a merger 600 and a cache memory 50.

The rasterizing unit 10 projects a triangle (a unit figure constituting the surface of an object rendered) onto a rendering area in a screen coordinate system. The rasterizing unit 10 divides the rendering area into multiple unit areas, generates pixels included in the unit areas and outputs pixel groups for each of the respective unit areas. The rasterizing unit 10 sequentially generates pixel groups for the respective unit areas for output, repeating the step of generation for multiple unit figures (triangles).

FIG. 2 shows how pixels are generated by the rasterizing unit 10. FIG. 2 shows a part of a rendering area 2000 in a screen coordinate system. A triangle 12 is a projection of a triangle (a unit figure constituting the surface of a three-dimensional object) onto a rendering area. The rasterizing unit 10 generates pixels inside the polygon by using the Digital Differential Analyzer (DDA) method, based upon the coordinate values, colors, fog values and texture coordinates of three vertices 12a, 12b and 12c of the triangle 12.

The rendering area is divided into multiple congruent unit areas 14.

The rasterizing unit 10 sequentially generates, for each unit area, pixel information (color, fog value etc.) for an area embraced by the triangle 12 and outputs the pixel information thus generated for the respective unit areas. Hereinafter, pixels for which pixel information is generated by the rasterizing unit 10 will be referred to as valid pixels.

In this embodiment, the unit area 14 having a predetermined size includes 4×4 pixels. The 4×4 pixel group included in the unit area 14 will be referred to as a stamp STP. Each stamp STP maintains position information indicating the position of the associated unit area in the screen coordinate system. Hereinafter, the position information of a stamp will be referred to as a stamp address ADDstp.
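The relationship between valid pixels, stamps and stamp addresses may be illustrated by the following sketch. The representation of the stamp address ADDstp as a (column, row) pair of the unit area, the function names and the list-of-booleans mask are assumptions made for this sketch only; the embodiment does not prescribe an encoding.

```python
STAMP_SIZE = 4  # a stamp STP covers 4x4 pixels in this embodiment


def stamp_address(x, y):
    """Return an assumed stamp address: (column, row) of the unit area
    that contains screen pixel (x, y)."""
    return (x // STAMP_SIZE, y // STAMP_SIZE)


def group_into_stamps(valid_pixels):
    """Group screen coordinates of valid pixels into stamps.

    Returns a dict mapping each stamp address to a 4x4 validity mask,
    indexed as mask[row][column] within the stamp.
    """
    stamps = {}
    for x, y in valid_pixels:
        addr = stamp_address(x, y)
        if addr not in stamps:
            stamps[addr] = [[False] * STAMP_SIZE for _ in range(STAMP_SIZE)]
        stamps[addr][y % STAMP_SIZE][x % STAMP_SIZE] = True
    return stamps


# Usage with a few hypothetical valid pixels.
stamps = group_into_stamps([(0, 0), (3, 3), (4, 0), (9, 5)])
```

Only unit areas that contain at least one valid pixel appear in the result, which mirrors the fact that only stamps including valid pixels are passed on.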

The rasterizing unit 10 sequentially generates pixel information for each of the stamps STP and sequentially outputs stamps STP that include valid pixels to the merger 600.

The merger 600 merges multiple stamps STP output from the rasterizing unit 10 and writes each of the stamps thus merged in the cache memory 50 in the subsequent stage. A stamp obtained by merging the multiple stamps STP in the merger 600 will be referred to as a final stamp STPf. The rasterizer 100 and the memory interface unit 110 inside the graphics processor exchange data with each other, using a stamp STP as a unit of processing.

Merge performed in the merger 600 will be described. The merger 600 merges stamps having the identical stamp address ADDstp.

Merge of the stamps is performed using the following two parameters. A first parameter represents the size of a unit referred to for determination as to whether merge is possible. A second parameter represents the size of a unit of actual merge. FIGS. 3A-3C show how multiple stamps are merged under different conditions. A stamp STPm represents a merged stamp obtained by merging two stamps STP1 and STP2.

Hereinafter, a small pixel group comprising 2×2 pixels obtained by dividing a stamp will be referred to as a quad QD. A quad that includes a valid pixel will be referred to as a valid quad.

FIG. 3A shows merge performed when both parameters are “pixels”. The stamp STP1 and the stamp STP2 of FIG. 3A have the identical stamp address. These stamps STP do not include valid pixels that occur at identical positions. Therefore, upon pixel-by-pixel determination on overlapping (occurrence at identical positions), the stamps will be merged without any problem.

FIG. 3B shows merge performed when both parameters are “quads (QD)”. In merge shown in FIG. 3B, a determination on occurrence at identical positions is made quad by quad. As a result, quads at top right of the stamp STP1 and the stamp STP2 are both valid and occur at identical positions. Therefore, the two stamps are not merged. The two stamps STP1 and STP2 are output unmodified.

FIG. 3C shows merge according to this embodiment, wherein the first parameter is “pixel” and the second parameter is “quad (QD)”. In merge of FIG. 3C, a determination on occurrence at identical positions is made pixel by pixel. It is found that quads QD at top right of the stamp STP1 and the stamp STP2 are both valid and occur at identical positions, but valid pixels are not found to overlap each other upon pixel-by-pixel determination. Additionally, the total number of valid quads is 4 or fewer. Thus, the stamps can be merged into a single stamp STPm. In the merged stamp STPm obtained as a result of quad-based merge, the position of a quad may be different from the original position. This is addressed by relocation described later.

As described, in the merge according to this embodiment shown in FIG. 3C, determination on occurrence at identical positions is made pixel by pixel and merge is performed quad by quad. As a result, efficiency in merging the stamp STP1 and the stamp STP2 shown in FIG. 3C is equal to that of FIG. 3A. When implemented in actual image processing, efficiency in merge is highest in FIG. 3A, followed by FIG. 3C and FIG. 3B.
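The three combinations of the first and second parameters may be compared with the following sketch. The validity masks `stp1` and `stp2` are hypothetical and are not copied from FIGS. 3A-3C; the function names and the string values "pixel" and "quad" are likewise assumptions. Quads are numbered 0 to 3 in row-major order for the purpose of the sketch.

```python
def quad_valid(mask, q):
    """True if quad q (0..3, row-major) of a 4x4 mask has any valid pixel."""
    qy, qx = divmod(q, 2)
    return any(mask[2 * qy + dy][2 * qx + dx]
               for dy in range(2) for dx in range(2))


def pixels_overlap(a, b):
    """True if both masks have a valid pixel at an identical position."""
    return any(a[y][x] and b[y][x] for y in range(4) for x in range(4))


def quads_overlap(a, b):
    """True if both masks have a valid quad at an identical position."""
    return any(quad_valid(a, q) and quad_valid(b, q) for q in range(4))


def valid_quad_count(mask):
    return sum(quad_valid(mask, q) for q in range(4))


def can_merge(a, b, check_unit, merge_unit):
    """check_unit is the first parameter, merge_unit the second."""
    overlap = pixels_overlap(a, b) if check_unit == "pixel" else quads_overlap(a, b)
    if overlap:
        return False
    if merge_unit == "quad":
        # Quads are moved whole, so all valid quads must fit in one stamp.
        return valid_quad_count(a) + valid_quad_count(b) <= 4
    return True


# Hypothetical stamps sharing a valid top-right quad but no valid pixel.
stp1 = [[1, 1, 1, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
stp2 = [[0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 1], [0, 0, 1, 1]]

result_3a = can_merge(stp1, stp2, "pixel", "pixel")
result_3b = can_merge(stp1, stp2, "quad", "quad")
result_3c = can_merge(stp1, stp2, "pixel", "quad")
```

For these masks the pixel-based determination permits the merge in the first and third cases, whereas the quad-based determination rejects it because the top-right quads are both valid.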

Considering the cost required to implement an image processing apparatus for performing such merge, reduction in the size defined by the first parameter, which represents a unit referred to for determination of the possibility of merge, will only require a change in the unit of determination. As such, it does not result in a considerable increase in implementation cost. In contrast, reduction in the size defined by the second parameter, which represents a unit of actual merge, will require increased cost for implementing a unit for accessing a memory. Thus, in terms of implementation cost, the arrangement of FIG. 3B is most favorable, followed by FIG. 3C and FIG. 3A.

Therefore, the method of merge according to this embodiment shown in FIG. 3C is superior in balance between efficiency in merge and implementation cost. A description will now be given of the structure of the merger 600 for achieving the merge according to this embodiment.

The merger 600 includes an area divider 20, an area discarder 30 and an area writer 40.

The area divider 20 divides stamps STP sequentially output from the rasterizing unit 10 into multiple congruent small pixel groups. In this embodiment, the area divider 20 divides a stamp STP of 4×4 pixels into congruent quads QD0-QD3 of 2×2 pixels and outputs the quads to the area discarder 30 in the subsequent stage.

FIG. 4 shows how the stamp STP is divided into the quads QD0-QD3. Each of the quads QD0-QD3 maintains position information indicating the relative position of the quad in the stamp, as a quad address ADDqd. It will be assumed that the quad addresses ADDqd are 00, 01, 10 and 11, starting with the top left quad and ending with the bottom right quad. In FIG. 4, the quad addresses are parenthesized.
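The division of a stamp into quads may be sketched as follows. The text fixes only that address 00 is the top left quad and 11 the bottom right quad; the sketch assumes row-major ordering, so that 01 is the top right quad and 10 the bottom left quad. The function name and the nested-list representation are also assumptions.

```python
def split_into_quads(stamp):
    """Divide a 4x4 stamp (list of 4 rows of 4 entries) into four 2x2 quads.

    Returns a dict keyed by a two-bit quad address string. Row-major order
    of the addresses 01 and 10 is an assumption of this sketch.
    """
    quads = {}
    for addr in range(4):
        qy, qx = divmod(addr, 2)
        rows = stamp[2 * qy:2 * qy + 2]
        quads[format(addr, "02b")] = [row[2 * qx:2 * qx + 2] for row in rows]
    return quads


# Usage: number the pixels 0..15 so that each quad's content is easy to see.
stamp = [[0, 1, 2, 3],
         [4, 5, 6, 7],
         [8, 9, 10, 11],
         [12, 13, 14, 15]]
quads = split_into_quads(stamp)
```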

The area divider 20 outputs multiple pixels in units of quads QD to the area discarder 30 in the subsequent stage.

The area discarder 30 retains quads QD selected from the multiple quads QD output from the area divider 20 and discards the rest. The area discarder 30 outputs the quads QD that survived the discarding process to the area writer 40 in the subsequent stage.
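One possible selection rule, namely discarding quads that include no valid pixel, may be sketched as follows. The pairing of each quad with its quad address in a tuple and the boolean 2×2 mask are representations assumed for this sketch.

```python
def discard_empty_quads(quads):
    """Keep only quads that include at least one valid pixel.

    quads -- list of (quad_address, 2x2 validity mask) pairs
    Returns the surviving pairs in their original order.
    """
    return [(addr, mask) for addr, mask in quads
            if any(any(row) for row in mask)]


# Usage with three hypothetical quads, one of which has no valid pixel.
quads = [
    (0, [[True, False], [False, False]]),
    (1, [[False, False], [False, False]]),
    (3, [[False, False], [False, True]]),
]
survivors = discard_empty_quads(quads)
```

Each surviving quad keeps its quad address, so its original position in the stamp remains known to the subsequent stage.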

The area writer 40 writes the quads QD that survived the discarding process by the area discarder 30 to the cache memory 50. In this process, the quads that survived the discarding process by the area discarder 30 are re-merged and are used to reconstruct a stamp comprising 4×4 pixels. A stamp re-merged by the area writer 40 will be referred to as a merged stamp STPm.

The area writer 40 writes the merged stamp STPm output from the area discarder 30 in the cache memory 50 as a unit of processing. The merged stamp STPm written in the cache memory 50 is stacked in a queue. The memory interface unit 110 sequentially writes the merged stamp STPm thus stacked into the graphics memory 120. Since the merged stamp STPm is not necessarily identical with the final stamp STPf, the area writer 40 relocates quads within the merged stamp STPm as described later.

A detailed description of processes in the area divider 20, the area discarder 30 and the area writer 40 will now be given.

Of those stamps STP sequentially output from the rasterizing unit 10, stamps that are targets of division by the area divider 20 will be referred to as source stamps STPs. A stamp which is obtained by sequentially and temporarily dividing source stamps STPs into quads, discarding unnecessary quads QD and re-merging stamps will be referred to as a merged stamp STPm. A stamp used in the process of merging source stamps STPs and maintained to be subject to merge will be referred to as a target stamp STPt.

The merger 600 merges by sequentially writing quads QDs of the source stamps STPs into empty quads of the target stamp STPt.

FIG. 5 is a flowchart showing how stamps STP are merged in the merger600 according to this embodiment.

Prior to starting a merge process, the area writer 40 initializes atarget stamp STPt (S100).

The area writer 40 then initializes a variable j to 0 (S102). Thevariable j indicates the place, in the sequence of quads in the targetstamp STPt, up to which the quad is valid. For example, if j=0, itindicates that no valid quad is written in the target stamp STPt. Ifj=4, it indicates that the target stamp STPt is filled with quads.

The area divider 20 then acquires a source stamp STPs (S104). Acquiringof a source stamp STPs is achieved by reading a stamp STP output fromthe rasterizing unit 10.

The stamp address ADDstp of the target stamp STPt and that of the source stamp STPs are then compared (S106). If the addresses of the two stamps differ (N in S106), merge is not performed and the current target stamp STPt is output to the area writer 40 in the subsequent stage as a merged stamp STPm (S130). The target stamp STPt is then initialized (S132). In association with the initialization of the target stamp STPt, the variable j is set to 0 (S134). Control then proceeds to S108.

If the addresses of the two stamps are identical (Y in S106), the area divider 20 divides the source stamp STPs into four quads QDs0-QDs3 (S108).

A variable i is then initialized to 0 (S110). The variable i indicates where, in the sequence of quads in the source stamp STPs subject to merge, the quad being processed is positioned. For example, if i=0, the quad QDs0 of the source stamp STPs is being processed.

The area discarder 30 then determines whether the ith quad QDsi of the source stamp STPs includes any valid pixel (S112). If the quad QDsi does not include any valid pixel (N in S112), the quad QDsi is discarded (S160). The variable i is incremented by 1 to proceed to the next quad QDsi (S126).

If the quad QDsi includes a valid pixel (Y in S112), the area writer 40 determines whether any of the valid pixels in the quad QDsi occurs at the identical position as a valid pixel of the target stamp STPt (S114). The determination is made by determining whether the quad QDsi and the target stamp STPt include valid pixels at the same coordinate position in the screen coordinate system.

If there are valid pixels occurring at identical positions (Y in S114), merge is not performed and the current target stamp STPt is output as a merged stamp STPm. The target stamp STPt is initialized and the variable j is returned to 0 (S140-S144). Control then proceeds to the merge process described below (S116-S128).

If there are no valid pixels occurring at identical positions (N in S114), control also proceeds to the merge process described below (S116-S128).

The quad QDsi of the source stamp STPs is written in the jth quad QDtj of the target stamp STPt (S116). The variable j is then incremented by 1 in order to shift the position of writing in the target stamp STPt (S118).

If j=4, it means that the quads QDs of the source stamps are merged with all quads QDt0-QDt3 of the target stamp STPt. Therefore, the current target stamp STPt is output as a merged stamp STPm (S150). The target stamp STPt is then initialized (S152) and the variable j is set to 0 (S154). The variable i is incremented by 1 to proceed to the next quad QDsi (S126).

If j≠4, it means that there is an empty quad in the target stamp STPt. The target stamp STPt is maintained as it is and the variable i is incremented by 1 to proceed to the next quad QDsi (S126).

A determination on the variable i is made. If i=4 (Y in S128), it means that merge is completed for all quads QDsi belonging to the source stamp STPs. Thereupon, the next source stamp STPs is acquired (S104).

If i≠4, control is returned to S112 to proceed to the next quad QDsi.

As described, the merger 600 merges if the stamp address of the target stamp STPt and that of the source stamp STPs are identical and if there are no valid pixels occurring at identical pixel positions.
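The flow of FIG. 5 may be easier to follow as a short sketch. The Python code below is illustrative only and is not taken from the embodiment: the names `Quad`, `Stamp`, `collides` and `merge`, the per-pixel validity tuple, and the final flush of a partially filled target stamp are all assumptions. The sketch also skips the output of an empty target stamp, which the flowchart does not exclude.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

QUADS_PER_STAMP = 4  # a 4x4 stamp holds four 2x2 quads


@dataclass
class Quad:
    quad_addr: int                          # ADDqd in the originating stamp (0..3)
    valid: Tuple[bool, bool, bool, bool]    # per-pixel validity (hypothetical layout)


@dataclass
class Stamp:
    addr: int          # ADDstp
    quads: List[Quad]  # QDs0..QDs3


def collides(quad: Quad, target: List[Quad]) -> bool:
    """S114: does the quad share a valid pixel position with the target stamp?"""
    return any(
        t.quad_addr == quad.quad_addr
        and any(a and b for a, b in zip(t.valid, quad.valid))
        for t in target
    )


def merge(source_stamps: List[Stamp]) -> List[List[Quad]]:
    merged: List[List[Quad]] = []
    target: List[Quad] = []              # S100; len(target) plays the role of j (S102)
    target_addr: Optional[int] = None
    for src in source_stamps:            # S104
        if target and src.addr != target_addr:   # N in S106
            merged.append(target)        # S130
            target = []                  # S132, S134
        target_addr = src.addr
        for quad in src.quads:           # S108, S110; the loop covers S126 and S128
            if not any(quad.valid):      # N in S112
                continue                 # S160: discard the quad
            if collides(quad, target):   # Y in S114
                merged.append(target)    # S140
                target = []              # S142-S144
            target.append(quad)          # S116, S118
            if len(target) == QUADS_PER_STAMP:   # j = 4
                merged.append(target)    # S150
                target = []              # S152, S154
    if target:
        merged.append(target)            # assumption: flush what remains at end of input
    return merged
```

Under these assumptions, four stamps with the same stamp address that each carry one valid pixel at a different position of the quad QDs0 yield a single merged stamp whose four quads all keep the quad address 00.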

FIG. 6 shows how stamps are merged according to the flowchart shown in FIG. 5. We consider a case where the stamps STP1-STP4 are sequentially input to the area divider 20. It will be assumed that the target stamp STPt is empty.

The stamp STP1 will be the source stamp STPs and is divided by the area divider 20 into the four quads QDs0-QDs3. Since the quads QDs1-QDs3 do not include any valid pixel, they are discarded by the area discarder 30.

Only the quad QDs0 that survived the discarding process by the area discarder 30 is output to the area writer 40 and is written in the quad QDt0 of the target stamp STPt.

Subsequently, the stamp STP2 will be the source stamp STPs and is divided by the area divider 20 into the four quads QDs0-QDs3. The area discarder 30 discards the quads QDs1-QDs3. The area writer 40 writes the quad QDs0 in the quad QDt1 of the target stamp STPt.

A similar process is performed for the stamps STP3 and STP4. Quads QD0 having the quad address ADDqd of (00) and derived from the stamps STP1-STP4 are written in the quads QDt0-QDt3 of the target stamp STPt so as to generate a merged stamp STPm.

The quads QDm0-QDm3 included in the merged stamp STPm maintain their quad addresses in the originating stamps STP1-STP4 as position information. The merged stamp STPm is output to the area writer 40 along with the position information.

FIG. 7 shows the structure of the area writer 40.

The area writer 40 includes a memory access unit 44 and a distributor 42. The memory access unit 44 is provided with output units 46 a-46 d for simultaneously writing four pixels included in a quad into the cache memory 50 in parallel.

The distributor 42 distributes pixels constituting a quad QD to the output units 46 a-46 d. The top left pixel in a quad is distributed to the output unit 46 a, the top right pixel to the output unit 46 b, the bottom left pixel to the output unit 46 c, and the bottom right pixel to the output unit 46 d.

By repeating the operation of simultaneously writing the four pixels into the cache memory 50 in parallel four times in units of quads, the area writer 40 completes a process of writing into one merged stamp.

FIG. 8 shows how the quads QDm0-QDm3 are distributed by the area writer 40 to the output units 46 a-46 d. The distributor 42 divides the merged stamp STPm into the four quads QDm0-QDm3 and sequentially distributes them to the output units 46 a-46 d at times T1, T2, T3 and T4.

At time T1, the distributor 42 decomposes the quad QDm0 into pixels and distributes the pixels to the four output units 46 a-46 d. Each output unit writes the pixel derived from decomposition into the cache memory 50. Subsequently, the quads QDm1-QDm3 are decomposed into pixels at times T2, T3 and T4 and are sequentially output. An output of a merged stamp STPm is produced in each unit period comprising times T1-T4.
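The schedule of FIG. 8 can be illustrated by the following Python sketch, which models only the order of distribution and not the hardware. The function name `distribute`, the string labels for the output units and the representation of a quad as a 4-tuple ordered top left, top right, bottom left, bottom right are assumptions made for illustration.

```python
# Output units in the order the distributor 42 feeds them:
# top left, top right, bottom left, bottom right.
OUTPUT_UNITS = ("46a", "46b", "46c", "46d")


def distribute(merged_stamp):
    """Return one (time, {output unit: pixel}) entry per quad of the merged stamp."""
    schedule = []
    for t, quad_pixels in enumerate(merged_stamp, start=1):
        # All four pixels of a quad are handed out in the same time slot.
        schedule.append((f"T{t}", dict(zip(OUTPUT_UNITS, quad_pixels))))
    return schedule
```

A merged stamp of four quads thus occupies the four time slots T1-T4, with every output unit busy in each slot.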

FIGS. 9A-9C show how the area writer 40 writes the merged stamp STPm into the cache memory 50. FIG. 9A shows the cache memory 50; FIG. 9B shows how writing of quads by the area writer 40 proceeds; and FIG. 9C shows the final stamp STPf obtained upon completion of the writing.

Since the merged stamp STPm is not necessarily identical with the final stamp STPf, the area writer 40 relocates quads within the merged stamp STPm.

The relocation is done as described below.

At time T1, the area writer 40 writes pixels included in the quad QDm0 of the merged stamp STPm into the cache memory 50. The area writer 40 refers to the quad address ADDqd of the quad QDm0. Since the quad address ADDqd of the quad QDm0 is 00, the quad QDm0 is written in an address 00 of the cache memory 50.

Subsequently, the quad QDm1 is written at time T2. Since the quad address of the quad QDm1 is also 00, the quad QDm1 is also written in the address 00 of the cache memory 50. Similarly, the quads QDm2 and QDm3 are written in the address 00 at times T3 and T4, respectively.

As a result of all quads QDm0-QDm3 being written in the quad address ADDqd=00, the final stamp STPf in which the stamps STP1-STP4 of FIG. 6 are merged is generated.
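The relocation can be sketched as follows in Python. This is a simplified model under stated assumptions: the function name `write_merged_stamp` is hypothetical, the quad address is taken to be a 2-bit value whose high bit selects the lower half and whose low bit selects the right half of the stamp, and an invalid pixel is represented by `None` and is not written, so that quads sharing an address do not overwrite one another's valid pixels.

```python
def write_merged_stamp(cache_stamp, merged_stamp):
    """Write each quad of a merged stamp at the position given by its quad address.

    cache_stamp  -- 4x4 list of lists standing in for one stamp in the cache memory
    merged_stamp -- list of (quad_addr, pixels); pixels is a 4-tuple ordered
                    top left, top right, bottom left, bottom right
    """
    for quad_addr, pixels in merged_stamp:
        row0 = (quad_addr >> 1) * 2   # assumed: high bit selects the lower half
        col0 = (quad_addr & 1) * 2    # assumed: low bit selects the right half
        for k, pixel in enumerate(pixels):
            if pixel is not None:     # assumed: invalid pixels are masked out
                cache_stamp[row0 + k // 2][col0 + k % 2] = pixel
    return cache_stamp
```

For a merged stamp like that of FIG. 6, in which all four quads carry the quad address 00, the four writes land in the same 2×2 region and together fill it.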

FIG. 10 shows how other stamps STP5-STP8 are merged. These stamps have an identical stamp address and there are no valid pixels occurring at identical positions.

First, the stamp STP5 is divided into four quads QDs0-QDs3. Since only the quad QDs0 includes a valid pixel, the quad QDs0 is written in the quad QDt0 of the target stamp STPt. The quads QDs1-QDs3 are discarded.

Subsequently, the stamp STP6 will be the source stamp STPs so that the quads QDs0-QDs3 are subject to merge. As a result, the quads QDs0 and QDs2 are written in the quads QDt1 and QDt2 of the target stamp STPt, and the quads QDs1 and QDs3 are discarded.

Subsequently, the stamp STP7 will be the next source stamp STPs. The quad QDs0 of the stamp STP7 is written in the quad QDt3 of the target stamp STPt. In this state, all quads in the target stamp STPt are valid so that the target stamp STPt is output as a merged stamp STPm1. The target stamp STPt is then initialized.

Subsequently, the quad QDs2 of the source stamp STPs is written in the quad QDt0 of a new target stamp STPt. The quads QDs1 and QDs3 are discarded.

The stamp STP8 will be the next source stamp. Since the quads QDs0-QDs3 all include valid pixels, the quads are written in the target stamp STPt without being discarded.

The quads QDs0-QDs2 are written in the quads QDt1-QDt3 of the target stamp STPt. In this state, all quads in the target stamp STPt are valid so that the target stamp STPt is output as a merged stamp STPm2. The target stamp STPt is initialized again.

The quad QDs3 of the source stamp STPs is written in the quad QDt0 of the new target stamp STPt. The quad QDt0 is then merged with quads from other source stamps STPs. The resultant target stamp is output as a merged stamp STPm3.
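The walkthrough of FIG. 10 can be reproduced with a small Python trace. The sketch is a simplification: because the stamps STP5-STP8 share a stamp address and have no valid pixels at identical positions, it only packs the surviving quads in order and omits the address and collision checks. The function name `pack_quads` and the string labels are hypothetical; the lists of surviving quads are those stated in the description.

```python
def pack_quads(stamps, quads_per_stamp=4):
    """Pack surviving quads into target stamps in arrival order.

    stamps -- list of (stamp name, indices of quads that include a valid pixel)
    Returns (completed merged stamps, contents of the still-open target stamp).
    """
    merged, target = [], []
    for name, valid_quads in stamps:
        for q in valid_quads:                 # discarded quads are simply absent
            target.append(f"{name}.QDs{q}")   # goes into the next free quad QDtj
            if len(target) == quads_per_stamp:
                merged.append(target)         # target stamp is full: output it
                target = []                   # and start a new target stamp
    return merged, target


# Surviving quads per stamp, as described for FIG. 10.
FIG10 = [("STP5", [0]), ("STP6", [0, 2]), ("STP7", [0, 2]), ("STP8", [0, 1, 2, 3])]
merged, target = pack_quads(FIG10)
```

Here `merged[0]` corresponds to the merged stamp STPm1 and `merged[1]` to STPm2, while `target` holds the quad QDs3 of the stamp STP8 that awaits quads from later source stamps.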

FIG. 11 shows how the quads QDm0-QDm3 are distributed to the output units 46 a-46 d in the area writer 40.

The area writer 40 outputs the four quads QDm0-QDm3 within the merged stamp STPm1 in the stated order at times T1, T2, T3 and T4, respectively.

The distributor 42 of the area writer 40 distributes the four pixels included in each quad to the respective output units. The merged stamp STPm1 is written in a period comprising times T1-T4; the merged stamp STPm2 is written in a period comprising times T5-T8; and the merged stamp STPm3 is written in a period comprising times T9-T12.

The quads QDm0-QDm3 constituting each of the merged stamps STPm1-STPm3 are output along with their addresses in the originating stamps STP5-STP8.

FIGS. 12A-12C show how the area writer 40 writes the merged stamps STPm1-STPm3 shown in FIG. 10 into the cache memory 50. At scheduled times, the area writer 40 refers to the addresses of the quads and writes the quads into the cache memory 50, relocating them.

As a result, as shown in FIG. 12C, the final stamp STPf in which the stamps STP5-STP8 are merged is written in the cache memory 50. The memory interface unit 110 in the subsequent stage writes the final stamp STPf in the graphics memory 120.

Thus, according to the image processing apparatus 1000 of this embodiment, stamps, which are pixel groups representing units of processing output from the rasterizing unit 10, are divided into quads, which are smaller pixel groups. Unnecessary quads are discarded and quads that survived are merged. This can increase the number of valid pixels included in merged stamps obtained as a result of merge, resulting in an efficient rendering process.

It will be particularly noted that the nature of a triangle strip is likely to generate a series of stamps which have an identical stamp address and in which valid pixels do not occur at identical positions. Accordingly, by merging a series of stamps efficiently, the number of valid pixels can be increased.

The image processing apparatus according to this embodiment temporarily divides the stamps into quads for pixel-based determination as to whether merge is possible, while the actual process of merge is quad-based. Accordingly, a favorable balance is achieved between implementation cost and processing efficiency.

Further, the merged stamp STPm produced by the area discarder 30 is corrected by the area writer 40 into a final stamp STPf, which is ultimately output, through localized, intra-stamp relocation. Relocation performed by the area writer 40 is implemented by control of addresses in the cache memory 50 and relocation using cross bars. Addition or modification in hardware is required only in the memory access unit. The inventive approach can easily be implemented in conventional image processing apparatuses.

Second Embodiment

The second embodiment represents an improvement in which the stamp merge process described with reference to the first embodiment is expanded. In the first embodiment, quads are subject to merge only when the source stamp STPs has the same stamp address as the target stamp STPt. The characteristic feature of the second embodiment is that merge of stamps having different stamp addresses is permitted. Hereinafter, merge in which stamps having different stamp addresses may be merged will be referred to as an expanded merge.

FIG. 13 shows how merge proceeds when stamps having different stamp addresses are input in succession. Stamps STP10-STP13 are sequentially input from the rasterizing unit 10 to the area divider 20. It will be assumed that the stamp address ADDstp of the stamps STP10 and STP11 is 0001 and the stamp address ADDstp of the stamps STP12 and STP13 is 0010. None of the valid pixels in the stamps STP10-STP13 occurs at identical positions.

The area divider 20 uses the stamp STP10 as the source stamp STPs and divides the stamp STP10 into quads QDs0-QDs3. The quad QDs0 is written in the quad QDt0 of the target stamp STPt, and the quads QDs1-QDs3 are discarded.

Subsequently, the stamp STP11 will be the source stamp STPs and is divided into the quads QDs0-QDs3. The quad QDs0 is written in the quad QDt1 of the target stamp STPt.

Subsequently, the stamp STP12 will be the source stamp and is divided into the quads QDs0-QDs3. The quad QDs2 is written in the quad QDt2 of the target stamp STPt.

Subsequently, the stamp STP13 will be the source stamp and is divided into the quads QDs0-QDs3. The quad QDs2 is written in the quad QDt3 of the target stamp STPt.

The quads not written in the target stamp STPt are all discarded.

In this state, all quads in the target stamp STPt are valid so that the target stamp STPt is output as a merged stamp STPm, and the target stamp is initialized.

Quads QDm0-QDm3 included in the merged stamp STPm maintain their quad addresses ADDqd in the originating stamps STP10-STP13 along with the stamp addresses ADDstp of the originating stamps STP10-STP13.

FIG. 14 shows how the quads QDm0-QDm3 are distributed to the output units 46 a-46 d in the area writer 40. The memory access unit 44 of the area writer 40 outputs the four quads QDm0-QDm3 within the merged stamp STPm in the stated order at times T1, T2, T3 and T4, respectively. At time T1, the quad QDm0 is decomposed into pixels, which are simultaneously output in parallel.

Subsequently, the quads QDm1-QDm3 are decomposed into pixels and are sequentially output at times T2, T3 and T4, respectively.

FIGS. 15A-15C show how the area writer 40 writes the merged stamp STPm of FIG. 13 into the cache memory 50.

As shown in FIG. 15A, the area writer 40 refers to the stamp address ADDstp in addition to the quad address ADDqd, in writing the quads QDm. For example, since the quad QDm0 input at time T1 has a stamp address of 0001 and a quad address of 00, the quad QDm0 is written in the associated address in the cache memory 50. The same is true of the quads QDm1-QDm3.

As a result, as shown in FIG. 15C, the final stamps STPf1 and STPf2 in which the stamps STP10-STP13 of FIG. 13 are merged are written into the cache memory 50. The final stamp STPf1 is a result of merging the stamps STP10 and STP11. The final stamp STPf2 is a result of merging the stamps STP12 and STP13. The final stamps STPf1 and STPf2 are written in positions with the stamp addresses ADDstp of 0001 and 0010.
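The write step of the expanded merge can be sketched in Python as follows. The sketch is illustrative and rests on assumptions: the function name `write_expanded`, the dictionary standing in for the cache memory 50, the bit layout of the quad address (high bit selects the lower half, low bit the right half), the use of `None` for invalid pixels, and the pixel labels in the example are all hypothetical. The stamp addresses and quad addresses are those given for FIG. 13.

```python
def write_expanded(cache, merged_stamp):
    """Write the quads of one merged stamp using both ADDstp and ADDqd.

    cache        -- dict mapping a stamp address to a 4x4 list of lists
    merged_stamp -- list of (stamp_addr, quad_addr, pixels); pixels is a 4-tuple
                    ordered top left, top right, bottom left, bottom right
    """
    for stamp_addr, quad_addr, pixels in merged_stamp:
        # The stamp address selects the stamp, the quad address the 2x2 region.
        stamp = cache.setdefault(stamp_addr, [[None] * 4 for _ in range(4)])
        row0 = (quad_addr >> 1) * 2
        col0 = (quad_addr & 1) * 2
        for k, pixel in enumerate(pixels):
            if pixel is not None:         # assumed: invalid pixels are not written
                stamp[row0 + k // 2][col0 + k % 2] = pixel
    return cache


# Merged stamp of FIG. 13: QDs0 of STP10 and STP11 (ADDstp 0001),
# QDs2 of STP12 and STP13 (ADDstp 0010). Pixel contents are made up.
STPm = [
    (0b0001, 0b00, ("p10", None, None, None)),
    (0b0001, 0b00, (None, "p11", None, None)),
    (0b0010, 0b10, ("p12", None, None, None)),
    (0b0010, 0b10, (None, "p13", None, None)),
]
cache = write_expanded({}, STPm)
```

After the call, `cache` holds two stamps, one per stamp address, which correspond to the final stamps STPf1 and STPf2.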

By performing expanded merge according to the embodiment in which stamps having different addresses are temporarily merged, the number of times that the area writer 40 accesses the cache memory 50 can be reduced.

According to expanded merge, merge can be performed even if stamp addresses differ. Therefore, the number of valid quads within a merged stamp can be increased and efficiency in image processing can be improved.

The embodiment described is only illustrative in nature and various variations in constituting elements and processes involved are possible. Those skilled in the art would readily appreciate that such variations are also within the scope of the present invention.

In the described embodiments, a stamp comprising 4×4 pixels forms a pixel group, and a quad comprising 2×2 pixels forms a small pixel group. Other configurations are also possible. For example, an array of 1×4 pixels may form a small pixel group, or an array of 4×1 pixels may form a quad.

The number of pixels included in a quad is preferably, but not necessarily, equal to the number of pixels simultaneously written in the cache memory 50 in parallel. The number of pixels included in a quad may be two, in a configuration in which four pixels are simultaneously written in the cache memory 50 in parallel. The size of a stamp and the size of a quad may appropriately be determined by considering the hardware cost and the cost required in implementing the process.

While the preferred embodiments of the present invention have been described using specific terms, such description is for illustrative purposes only, and it is to be understood that changes and variations may be made without departing from the spirit or scope of the appended claims.

1. An image processing apparatus which renders, in a screen coordinate system, unit figures each constituting the surface of a three-dimensional object to be rendered, comprising: a rasterizing unit which divides a unit figure into a plurality of unit areas on the screen coordinate system and outputs the unit areas; an area divider which divides each of the unit areas output from the rasterizing unit into a plurality of subareas; an area discarder which discards as necessary a subarea obtained by the division by the area divider according to a predetermined rule; and an area writer which writes a subarea that survived the discarding process by the area discarder into a memory.

2. The image processing apparatus according to claim 1, wherein the area writer re-merges subareas that survived the discarding process and writes merged areas obtained by re-merges in the memory.

3. The image processing apparatus according to claim 2, wherein each of the merged areas has the same size as the unit area.

4. The image processing apparatus according to claim 1, wherein the size of the subarea corresponds to a unit throughput in which the area writer writes the subareas into the memory at a time.

5. The image processing apparatus according to claim 2, wherein, of the subareas that survived the discarding process by the area discarder, the area writer merges subareas derived from unit areas having the same coordinates in the screen coordinate system before the division.

6. The image processing apparatus according to claim 2, wherein the area writer refers to information indicating the relative position of a subarea in the unit area to which the subarea belonged before the division and writes the subarea in an address in the memory corresponding to the information.

7. The image processing apparatus according to claim 1, wherein the unit area is a rectangular area, the rasterizing unit divides a rendering area so that each of the plurality of unit areas includes a pixel group, the pixel number in the vertical direction and the pixel number in the horizontal direction of a pixel group in a given unit area being identical with the corresponding numbers of a pixel group in another unit area, and the area divider divides the unit area including the pixel group into a plurality of subareas each including a small pixel group, the pixel number in the vertical direction and the pixel number in the horizontal direction of a pixel group in a given subarea being identical with the corresponding numbers of a pixel group in another subarea.

8. The image processing apparatus according to claim 7, wherein, of the plurality of subareas obtained by the division by the area divider, the area discarder discards a subarea that does not include any valid pixels.

9. The image processing apparatus according to claim 7, wherein, of the subareas that survived the discarding process by the area discarder, the area writer re-merges subareas which do not include valid pixels at identical coordinates in the screen coordinate system and writes merged areas obtained by re-merge in the memory.

10. The image processing apparatus according to claim 9, wherein, of the subareas that survived the discarding process by the area discarder, the area writer merges subareas derived from unit areas having the same coordinates in the screen coordinate system before the division.

11. The image processing apparatus according to claim 9, wherein the area writer refers to information indicating the relative position of a subarea in the unit area to which the subarea belonged before the division so as to write the subarea in an address in the memory corresponding to the information.

12. The image processing apparatus according to claim 7, wherein the area writer comprises a memory access unit which writes pixels included in the subarea into the memory in parallel.

13. An image processing method which renders, in a screen coordinate system, unit figures each constituting the surface of a three-dimensional object to be rendered, comprising: rasterizing by dividing a unit figure into a plurality of unit areas on the screen coordinate system and outputting the unit areas; dividing each of the unit areas output from the rasterizing into a plurality of subareas; discarding as necessary a subarea obtained by dividing the unit area according to a predetermined rule; and writing a subarea that survived the discarding into a memory.

14. The image processing method according to claim 13, wherein the rasterizing divides a rendering area so that each of the plurality of unit areas includes a pixel group, the pixel number in the vertical direction and the pixel number in the horizontal direction of a pixel group in a given unit area being identical with the corresponding numbers of a pixel group in another unit area, and the unit area dividing divides the unit area including the pixel group into a plurality of subareas each including a small pixel group, the pixel number in the vertical direction and the pixel number in the horizontal direction of a pixel group in a given subarea being identical with the corresponding numbers of a pixel group in another subarea.

15. The image processing method according to claim 14, wherein, of the plurality of subareas obtained by the division, the discarding discards a subarea that does not include any valid pixels.

16. An image processing method which renders, in a screen coordinate system, unit figures each constituting the surface of a three-dimensional object to be rendered, comprising: dividing a unit figure into a plurality of unit areas on the screen coordinate system and outputting the unit areas; and generating merged areas by retrieving, from a plurality of subareas constituting each of the unit areas output from the dividing, subareas that include valid pixels.

17. An image processing method which renders, in a screen coordinate system, unit figures each constituting the surface of a three-dimensional object to be rendered, comprising: dividing a unit figure into a plurality of unit areas on the screen coordinate system and outputting the unit areas; and writing subareas, of a plurality of subareas constituting each of the unit areas output from the dividing, that include valid pixels into a memory in parallel.

18. A computer program product which renders, in a screen coordinate system, unit figures each constituting the surface of a three-dimensional object to be rendered, comprising: a dividing module which causes a computer to divide a unit figure into a plurality of unit areas on the screen coordinate system and outputting the unit areas; and a merging module which causes a computer to retrieve, from a plurality of subareas constituting each of the unit areas output from the dividing, subareas that include valid pixels so as to generate merged areas.

19. A computer program product which renders, in a screen coordinate system, unit figures each constituting the surface of a three-dimensional object to be rendered, comprising: a dividing module which causes a computer to divide a unit figure into a plurality of unit areas on the screen coordinate system and outputting the unit areas; and a writing module which causes a computer to write subareas, of a plurality of subareas constituting each of the unit areas output from the dividing, that include valid pixels into a memory in parallel.